Plugging Anti and Output Dependence Removal Techniques into Loop Parallelization Algorithms

Ecole Normale Supérieure de Lyon

Authors

  • Pierre-Yves Calland
  • Alain Darte
  • Yves Robert
Abstract

In this paper we briefly survey some loop transformation techniques that break anti or output dependences, or artificial cycles involving such "false" dependences. These false dependences are removed through the introduction of temporary buffer arrays. Next we show how to plug these techniques into loop parallelization algorithms (such as Allen and Kennedy's algorithm). The goal is to extract as many parallel loops as the intrinsic degree of parallelism of the nest allows, while avoiding a full memory expansion. We try to reduce the number of temporary arrays that we introduce, as well as their dimension.
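To make the idea of removing a false dependence with a temporary buffer array concrete, here is a minimal sketch. It is not taken from the paper: the loop, the function names `original` and `with_buffer`, and the `order` parameter are all illustrative assumptions. The original loop carries an anti dependence because iteration `i` reads `a[i + 1]` before iteration `i + 1` overwrites it. Copying the values to be overwritten into a one-dimensional buffer first leaves two loops whose iterations can run in any order.

```python
def original(a, b):
    """Sequential loop with an anti dependence: iteration i reads a[i + 1],
    which iteration i + 1 later overwrites. Expects len(a) == len(b) + 1."""
    a = list(a)  # work on a copy so the caller's list is untouched
    n = len(b)
    for i in range(n):
        a[i] = a[i + 1] + b[i]
    return a


def with_buffer(a, b, order=None):
    """Same computation after saving the soon-to-be-overwritten values in a
    temporary buffer (hypothetical name: tmp). Neither loop carries a
    dependence between iterations, so 'order' may be any permutation of
    range(len(b)); this stands in for parallel execution."""
    a = list(a)
    n = len(b)
    order = list(range(n)) if order is None else list(order)
    tmp = [0] * n
    for i in order:  # first parallel loop: save the old values of a
        tmp[i] = a[i + 1]
    for i in order:  # second parallel loop: reads only tmp and b
        a[i] = tmp[i] + b[i]
    return a
```

Calling `with_buffer([1, 2, 3, 4], [10, 20, 30], order=[2, 0, 1])` returns the same list as `original([1, 2, 3, 4], [10, 20, 30])`, whereas running the original loop body in that order would not. The cost is one extra array of the same size as the loop range, which is the kind of memory overhead the abstract says it tries to keep small.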


Similar resources

Plugging Anti and Output Dependence Removal Techniques Into Loop Parallelization Algorithm

In this paper we briefly survey some loop transformation techniques that break anti or output dependences, or artificial cycles involving such ‘false’ dependences. These false dependences are removed through the introduction of temporary buffer arrays. Next we show how to plug these techniques into loop parallelization algorithms (such as Allen and Kennedy’s algorithm). The goal is to extract a...

Combining Retiming and Scheduling Techniques for Loop Parallelization and Loop Tiling (Ecole Normale Supérieure de Lyon)

Tiling is a technique used for exploiting medium-grain parallelism in nested loops. It relies on a first step that detects sets of permutable nested loops. All algorithms developed so far consider the statements of the loop body as a single block; in other words, they are not able to take advantage of the structure of dependences between different statements. In this report we overcome this limitation b...

The Bouclettes Loop Parallelizer (Ecole Normale Supérieure de Lyon)

Bouclettes is a source-to-source loop nest parallelizer. It takes as input Fortran uniform, perfectly nested loops and gives as output an HPF (High Performance Fortran) program with data distribution and parallel (HPF INDEPENDENT) loops. This paper presents the tool and the underlying parallelization methodology.

Combining Retiming and Scheduling Techniques for Loop Parallelization and Loop Tiling

Tiling is a technique used for exploiting medium-grain parallelism in nested loops. It relies on a first step that detects sets of permutable nested loops. All algorithms developed so far consider the statements of the loop body as a single block; in other words, they are not able to take advantage of the structure of dependences between different statements. In this paper, we overcome this limita...


Publication date: 1996